Car Detection in Aerial Imagery

After extensive research, I chose to work with Detectron2, Meta's object detection framework.

I took the following steps:

  • Install the model dependencies
  • Download my custom dataset
  • Write the model training configuration
  • Run the model training
  • Evaluate the model's performance
  • Run model inference on the test images
  • Convert the image to green
  • Paint the bounding boxes red
  • Execute the desired algorithm

Install Dependencies

In [ ]:
# install dependencies: (use cu101 because colab has CUDA 10.1)
!pip install -U torch==1.5 torchvision==0.6 -f https://download.pytorch.org/whl/cu101/torch_stable.html 
!pip install cython pyyaml==5.1
!pip install -U 'git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI'
import torch, torchvision
print(torch.__version__, torch.cuda.is_available())
!gcc --version
# install detectron2:
!pip install detectron2==0.1.3 -f https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://download.pytorch.org/whl/cu101/torch_stable.html
Collecting torch==1.5
  Downloading https://download.pytorch.org/whl/cu101/torch-1.5.0%2Bcu101-cp37-cp37m-linux_x86_64.whl (703.8 MB)
     |████████████████████████████████| 703.8 MB 20 kB/s 
Collecting torchvision==0.6
  Downloading https://download.pytorch.org/whl/cu101/torchvision-0.6.0%2Bcu101-cp37-cp37m-linux_x86_64.whl (6.6 MB)
     |████████████████████████████████| 6.6 MB 1.3 MB/s 
Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from torch==1.5) (0.16.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from torch==1.5) (1.21.6)
Requirement already satisfied: pillow>=4.1.1 in /usr/local/lib/python3.7/dist-packages (from torchvision==0.6) (7.1.2)
Installing collected packages: torch, torchvision
  Attempting uninstall: torch
    Found existing installation: torch 1.12.1+cu113
    Uninstalling torch-1.12.1+cu113:
      Successfully uninstalled torch-1.12.1+cu113
  Attempting uninstall: torchvision
    Found existing installation: torchvision 0.13.1+cu113
    Uninstalling torchvision-0.13.1+cu113:
      Successfully uninstalled torchvision-0.13.1+cu113
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
torchtext 0.13.1 requires torch==1.12.1, but you have torch 1.5.0+cu101 which is incompatible.
torchaudio 0.12.1+cu113 requires torch==1.12.1, but you have torch 1.5.0+cu101 which is incompatible.
fastai 2.7.9 requires torch<1.14,>=1.7, but you have torch 1.5.0+cu101 which is incompatible.
fastai 2.7.9 requires torchvision>=0.8.2, but you have torchvision 0.6.0+cu101 which is incompatible.
Successfully installed torch-1.5.0+cu101 torchvision-0.6.0+cu101
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Requirement already satisfied: cython in /usr/local/lib/python3.7/dist-packages (0.29.32)
Collecting pyyaml==5.1
  Downloading PyYAML-5.1.tar.gz (274 kB)
     |████████████████████████████████| 274 kB 2.1 MB/s 
Building wheels for collected packages: pyyaml
  Building wheel for pyyaml (setup.py) ... done
  Created wheel for pyyaml: filename=PyYAML-5.1-cp37-cp37m-linux_x86_64.whl size=44092 sha256=629d2ac28b2aabdf6da382a0e6455ae18763d5114eb8c6d16dc87a7ceb398ca9
  Stored in directory: /root/.cache/pip/wheels/77/f5/10/d00a2bd30928b972790053b5de0c703ca87324f3fead0f2fd9
Successfully built pyyaml
Installing collected packages: pyyaml
  Attempting uninstall: pyyaml
    Found existing installation: PyYAML 6.0
    Uninstalling PyYAML-6.0:
      Successfully uninstalled PyYAML-6.0
ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
fastai 2.7.9 requires torch<1.14,>=1.7, but you have torch 1.5.0+cu101 which is incompatible.
fastai 2.7.9 requires torchvision>=0.8.2, but you have torchvision 0.6.0+cu101 which is incompatible.
dask 2022.2.0 requires pyyaml>=5.3.1, but you have pyyaml 5.1 which is incompatible.
Successfully installed pyyaml-5.1
Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting git+https://github.com/cocodataset/cocoapi.git#subdirectory=PythonAPI
  Cloning https://github.com/cocodataset/cocoapi.git to /tmp/pip-req-build-fwarmm91
  Running command git clone -q https://github.com/cocodataset/cocoapi.git /tmp/pip-req-build-fwarmm91
Requirement already satisfied: setuptools>=18.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools==2.0) (57.4.0)
Requirement already satisfied: cython>=0.27.3 in /usr/local/lib/python3.7/dist-packages (from pycocotools==2.0) (0.29.32)
Requirement already satisfied: matplotlib>=2.1.0 in /usr/local/lib/python3.7/dist-packages (from pycocotools==2.0) (3.2.2)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools==2.0) (0.11.0)
Requirement already satisfied: numpy>=1.11 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools==2.0) (1.21.6)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools==2.0) (1.4.4)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools==2.0) (3.0.9)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib>=2.1.0->pycocotools==2.0) (2.8.2)
Requirement already satisfied: typing-extensions in /usr/local/lib/python3.7/dist-packages (from kiwisolver>=1.0.1->matplotlib>=2.1.0->pycocotools==2.0) (4.1.1)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib>=2.1.0->pycocotools==2.0) (1.15.0)
Building wheels for collected packages: pycocotools
  Building wheel for pycocotools (setup.py) ... done
  Created wheel for pycocotools: filename=pycocotools-2.0-cp37-cp37m-linux_x86_64.whl size=265163 sha256=760ad144520e45f46cece7934bb8a986cb018813ce830d8bb9aa992b86a21558
  Stored in directory: /tmp/pip-ephem-wheel-cache-lulpwcul/wheels/e2/6b/1d/344ac773c7495ea0b85eb228bc66daec7400a143a92d36b7b1
Successfully built pycocotools
Installing collected packages: pycocotools
  Attempting uninstall: pycocotools
    Found existing installation: pycocotools 2.0.4
    Uninstalling pycocotools-2.0.4:
      Successfully uninstalled pycocotools-2.0.4
Successfully installed pycocotools-2.0
1.5.0+cu101 True
gcc (Ubuntu 7.5.0-3ubuntu1~18.04) 7.5.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Looking in links: https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/index.html
Collecting detectron2==0.1.3
  Downloading https://dl.fbaipublicfiles.com/detectron2/wheels/cu101/torch1.5/detectron2-0.1.3%2Bcu101-cp37-cp37m-linux_x86_64.whl (6.2 MB)
     |████████████████████████████████| 6.2 MB 1.5 MB/s 
Requirement already satisfied: future in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (0.16.0)
Requirement already satisfied: tqdm>4.29.0 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (4.64.1)
Requirement already satisfied: Pillow in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (7.1.2)
Requirement already satisfied: tensorboard in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (2.8.0)
Requirement already satisfied: pydot in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (1.3.0)
Requirement already satisfied: tabulate in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (0.8.10)
Requirement already satisfied: termcolor>=1.1 in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (1.1.0)
Collecting yacs>=0.1.6
  Downloading yacs-0.1.8-py3-none-any.whl (14 kB)
Requirement already satisfied: matplotlib in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (3.2.2)
Collecting fvcore>=0.1.1
  Downloading fvcore-0.1.5.post20220512.tar.gz (50 kB)
     |████████████████████████████████| 50 kB 1.6 MB/s 
Collecting mock
  Downloading mock-4.0.3-py3-none-any.whl (28 kB)
Requirement already satisfied: cloudpickle in /usr/local/lib/python3.7/dist-packages (from detectron2==0.1.3) (1.5.0)
Requirement already satisfied: numpy in /usr/local/lib/python3.7/dist-packages (from fvcore>=0.1.1->detectron2==0.1.3) (1.21.6)
Requirement already satisfied: pyyaml>=5.1 in /usr/local/lib/python3.7/dist-packages (from fvcore>=0.1.1->detectron2==0.1.3) (5.1)
Collecting iopath>=0.1.7
  Downloading iopath-0.1.10.tar.gz (42 kB)
     |████████████████████████████████| 42 kB 1.0 MB/s 
Requirement already satisfied: typing_extensions in /usr/local/lib/python3.7/dist-packages (from iopath>=0.1.7->fvcore>=0.1.1->detectron2==0.1.3) (4.1.1)
Collecting portalocker
  Downloading portalocker-2.5.1-py2.py3-none-any.whl (15 kB)
Requirement already satisfied: kiwisolver>=1.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.1.3) (1.4.4)
Requirement already satisfied: cycler>=0.10 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.1.3) (0.11.0)
Requirement already satisfied: pyparsing!=2.0.4,!=2.1.2,!=2.1.6,>=2.0.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.1.3) (3.0.9)
Requirement already satisfied: python-dateutil>=2.1 in /usr/local/lib/python3.7/dist-packages (from matplotlib->detectron2==0.1.3) (2.8.2)
Requirement already satisfied: six>=1.5 in /usr/local/lib/python3.7/dist-packages (from python-dateutil>=2.1->matplotlib->detectron2==0.1.3) (1.15.0)
Requirement already satisfied: protobuf>=3.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (3.17.3)
Requirement already satisfied: werkzeug>=0.11.15 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (1.0.1)
Requirement already satisfied: grpcio>=1.24.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (1.48.1)
Requirement already satisfied: requests<3,>=2.21.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (2.23.0)
Requirement already satisfied: tensorboard-plugin-wit>=1.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (1.8.1)
Requirement already satisfied: tensorboard-data-server<0.7.0,>=0.6.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (0.6.1)
Requirement already satisfied: absl-py>=0.4 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (1.2.0)
Requirement already satisfied: google-auth-oauthlib<0.5,>=0.4.1 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (0.4.6)
Requirement already satisfied: google-auth<3,>=1.6.3 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (1.35.0)
Requirement already satisfied: wheel>=0.26 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (0.37.1)
Requirement already satisfied: markdown>=2.6.8 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (3.4.1)
Requirement already satisfied: setuptools>=41.0.0 in /usr/local/lib/python3.7/dist-packages (from tensorboard->detectron2==0.1.3) (57.4.0)
Requirement already satisfied: cachetools<5.0,>=2.0.0 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.1.3) (4.2.4)
Requirement already satisfied: pyasn1-modules>=0.2.1 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.1.3) (0.2.8)
Requirement already satisfied: rsa<5,>=3.1.4 in /usr/local/lib/python3.7/dist-packages (from google-auth<3,>=1.6.3->tensorboard->detectron2==0.1.3) (4.9)
Requirement already satisfied: requests-oauthlib>=0.7.0 in /usr/local/lib/python3.7/dist-packages (from google-auth-oauthlib<0.5,>=0.4.1->tensorboard->detectron2==0.1.3) (1.3.1)
Requirement already satisfied: importlib-metadata>=4.4 in /usr/local/lib/python3.7/dist-packages (from markdown>=2.6.8->tensorboard->detectron2==0.1.3) (4.12.0)
Requirement already satisfied: zipp>=0.5 in /usr/local/lib/python3.7/dist-packages (from importlib-metadata>=4.4->markdown>=2.6.8->tensorboard->detectron2==0.1.3) (3.8.1)
Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /usr/local/lib/python3.7/dist-packages (from pyasn1-modules>=0.2.1->google-auth<3,>=1.6.3->tensorboard->detectron2==0.1.3) (0.4.8)
Requirement already satisfied: idna<3,>=2.5 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.1.3) (2.10)
Requirement already satisfied: chardet<4,>=3.0.2 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.1.3) (3.0.4)
Requirement already satisfied: certifi>=2017.4.17 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.1.3) (2022.6.15)
Requirement already satisfied: urllib3!=1.25.0,!=1.25.1,<1.26,>=1.21.1 in /usr/local/lib/python3.7/dist-packages (from requests<3,>=2.21.0->tensorboard->detectron2==0.1.3) (1.24.3)
Requirement already satisfied: oauthlib>=3.0.0 in /usr/local/lib/python3.7/dist-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib<0.5,>=0.4.1->tensorboard->detectron2==0.1.3) (3.2.0)
Building wheels for collected packages: fvcore, iopath
  Building wheel for fvcore (setup.py) ... done
  Created wheel for fvcore: filename=fvcore-0.1.5.post20220512-py3-none-any.whl size=61288 sha256=b40b4e9c0739515556fb31939b628f8f8c3b6d0a803164e6fe99045edfbb3f35
  Stored in directory: /root/.cache/pip/wheels/68/20/f9/a11a0dd63f4c13678b2a5ec488e48078756505c7777b75b29e
  Building wheel for iopath (setup.py) ... done
  Created wheel for iopath: filename=iopath-0.1.10-py3-none-any.whl size=31549 sha256=134bb754194a23c923c5a89dfe19e8230d09ca07b77ff4fcae5f7cb38b66e2f1
  Stored in directory: /root/.cache/pip/wheels/aa/cc/ed/ca4e88beef656b01c84b9185196513ef2faf74a5a379b043a7
Successfully built fvcore iopath
Installing collected packages: portalocker, yacs, iopath, mock, fvcore, detectron2
Successfully installed detectron2-0.1.3+cu101 fvcore-0.1.5.post20220512 iopath-0.1.10 mock-4.0.3 portalocker-2.5.1 yacs-0.1.8
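
Note that the detectron2 wheel is built against a specific torch/CUDA pairing, which is exactly what the dependency-conflict errors above are complaining about. As a quick sanity check, the CUDA build tag can be pulled out of a pip version string like `1.5.0+cu101`; the helper below is my own sketch, not part of any library:

```python
def cuda_tag(version: str) -> str:
    """Return the CUDA build tag from a pip local version like '1.5.0+cu101'.

    PyTorch wheels encode the CUDA toolkit they were built against after a
    '+'; CPU-only wheels carry no suffix, which we report as 'cpu'.
    """
    return version.split("+", 1)[1] if "+" in version else "cpu"

# The detectron2 0.1.3+cu101 wheel expects a matching torch 1.5 cu101 build:
print(cuda_tag("1.5.0+cu101"))   # cu101
print(cuda_tag("1.12.1+cu113"))  # cu113
print(cuda_tag("1.5.0"))         # cpu
```

Comparing `cuda_tag(torch.__version__)` against the tag in the detectron2 wheel URL catches mismatches before the import fails.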

Relevant imports

In [ ]:
# You may need to restart your runtime prior to this, to let your installation take effect
# Some basic setup:
# Setup detectron2 logger
import detectron2
from detectron2.utils.logger import setup_logger
setup_logger()

# import some common libraries
import numpy as np
import cv2
import random
import os
import glob
from google.colab.patches import cv2_imshow

# import some common detectron2 utilities
from detectron2 import model_zoo
from detectron2.engine import DefaultPredictor, DefaultTrainer
from detectron2.config import get_cfg
from detectron2.utils.visualizer import Visualizer, ColorMode
from detectron2.data import DatasetCatalog, MetadataCatalog, build_detection_test_loader
from detectron2.data.datasets import register_coco_instances
from detectron2.evaluation import COCOEvaluator, inference_on_dataset
from detectron2.modeling import build_model

Import and Register my custom Dataset

Note: if the link expires, let me know and I'll send a new one

In [ ]:
!curl -L "https://app.roboflow.com/ds/CGdYIqBsHg?key=2WZlODtuOg" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip
!rm -f *.txt
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   892  100   892    0     0   1429      0 --:--:-- --:--:-- --:--:--  1429
100 5723k  100 5723k    0     0  3493k      0  0:00:01  0:00:01 --:--:-- 7083k
Archive:  roboflow.zip
 extracting: README.dataset.txt      
 extracting: README.roboflow.txt     
   creating: test/
 extracting: test/00000266_co_png.rf.d52cbe2b4b992346d4009cff1f923f61.jpg  
 extracting: test/00000268_co_png.rf.adada98eba89c01aa93258fcfdc03937.jpg  
 extracting: test/00000271_co_png.rf.07fc054292145538bb081e9d8e718cdb.jpg  
 extracting: test/00000272_co_png.rf.c52618161985e36efb1e5528e3b379b5.jpg  
 extracting: test/00000274_co_png.rf.cd935e4a3b74749b7f37e3c66a41a9ea.jpg  
 extracting: test/00000275_co_png.rf.cc1d34e2a615d0d4496e3486e1acdef3.jpg  
 extracting: test/00000276_co_png.rf.a196ad96b2a8a80273dd3d58f7a3c738.jpg  
 extracting: test/00000278_co_png.rf.000b56364382882add11e6eaf41a5e5b.jpg  
 extracting: test/00000279_co_png.rf.6928e406455ac417a308d36a1c2d16ba.jpg  
 extracting: test/00000280_co_png.rf.d5376a828d7bb60dd639da740407393b.jpg  
 extracting: test/00000281_co_png.rf.1926133066b5e717081489f820104f0c.jpg  
 extracting: test/00000282_co_png.rf.03941b08108d78b8299a94de94ef19c2.jpg  
 extracting: test/00000283_co_png.rf.f702d5b45319ca59c514421cd511e5ad.jpg  
 extracting: test/00000284_co_png.rf.3617d46739297d025037e30fc93f75b1.jpg  
 extracting: test/00000285_co_png.rf.31310bf523d2e2b264535f06028b2183.jpg  
 extracting: test/00000286_co_png.rf.63734907043a5028cba27d42902c98f0.jpg  
 extracting: test/00000287_co_png.rf.799e739c7f68e74bd2ed61da0ded4027.jpg  
 extracting: test/00000288_co_png.rf.0173c05a5c92a1a38ce0e8297b62656e.jpg  
 extracting: test/00000289_co_png.rf.3a4d97d6d595c59b84a4350557025436.jpg  
 extracting: test/00000293_co_png.rf.873a2fd2cceabe448f7d63218f773e1c.jpg  
 extracting: test/00000295_co_png.rf.ecfddb62d87bf3d1093f427bb6528f89.jpg  
 extracting: test/00000298_co_png.rf.01da5121d96add751e9576b58ba26777.jpg  
 extracting: test/00000299_co_png.rf.5001a083eb60c1b3b0c4799691d9e117.jpg  
 extracting: test/00000300_co_png.rf.17ff89cf9978650a6085c8e74b5d7448.jpg  
 extracting: test/00000302_co_png.rf.0fc96a833631ec1bb79e8b0814b372e1.jpg  
 extracting: test/00000303_co_png.rf.5e54caeba2cfea103a37004c0e4ba0cb.jpg  
 extracting: test/00000307_co_png.rf.aefdc6cacd2413471a197b1d889e0685.jpg  
 extracting: test/00000308_co_png.rf.f3e056da60b4e9d8660ae905f87ff42b.jpg  
 extracting: test/00000309_co_png.rf.17e79d1ac5ad002588512b3de0ed9eb6.jpg  
 extracting: test/00000312_co_png.rf.a73985f20b42c36570acbd95b4c8600c.jpg  
 extracting: test/00000315_co_png.rf.6de4a46482e928403e6c5c329356d4d7.jpg  
 extracting: test/00000317_co_png.rf.2e8da912578d384d6c87e08aeeb4addc.jpg  
 extracting: test/00000318_co_png.rf.29577a521b20a72e5947d3b4b3e4766b.jpg  
 extracting: test/00000319_co_png.rf.f12285f93bb585acd36dcba02964a452.jpg  
 extracting: test/00000548_co_png.rf.9b0c3243f943ddc9df0f9a072bc96384.jpg  
 extracting: test/00000555_co_png.rf.0b25de64d8566a7d1c79fd88bac49ba7.jpg  
 extracting: test/00000568_co_png.rf.90471cf9b3a5d2866db1bcd973cd9170.jpg  
 extracting: test/00000582_co_png.rf.a3d88b90abd9b5dac4cf4d6fa08acd86.jpg  
 extracting: test/00000587_co_png.rf.bc3327b346b12082430a05e8d1637672.jpg  
 extracting: test/00000594_co_png.rf.58747fcff783f884d62a1db1a47b0207.jpg  
 extracting: test/_annotations.coco.json  
   creating: train/
 extracting: train/00000267_co_png.rf.0ccd4864e9a4450b6f604eb37ecfc1a8.jpg  
 extracting: train/00000269_co_png.rf.d0f3aa521b0ece3b005a8cafcbe41a36.jpg  
 extracting: train/00000270_co_png.rf.e7e4b13aadb23f4ce3c486c3add85dc7.jpg  
 extracting: train/00000273_co_png.rf.0a3476fea12b9c20db035f0daa08d4d5.jpg  
 extracting: train/00000277_co_png.rf.8255348a3fdb5fcc0d76f1584b690fb1.jpg  
 extracting: train/00000290_co_png.rf.ebcb5a6bd2d88cbe0941e14ff02e8fad.jpg  
 extracting: train/00000291_co_png.rf.6e04348c411070715e00967d8ae7d845.jpg  
 extracting: train/00000292_co_png.rf.35687ffd365c58e8c0b275fac20f2e44.jpg  
 extracting: train/00000294_co_png.rf.c1d201e699a0af47b4270b9b1aaa4cd2.jpg  
 extracting: train/00000296_co_png.rf.a960c14460aa516564efb7261707b109.jpg  
 extracting: train/00000297_co_png.rf.66dfc34a1dc1504810f31973ba0b2039.jpg  
 extracting: train/00000301_co_png.rf.6f503504a6e4daead2fa1e17ff6a67f0.jpg  
 extracting: train/00000304_co_png.rf.52c5b554a01f65cf1c9ee446f7dcb407.jpg  
 extracting: train/00000305_co_png.rf.c52f550fd8fcf60471a4436603f119b4.jpg  
 extracting: train/00000306_co_png.rf.60e88cfc3de74534901c1b82c790f78c.jpg  
 extracting: train/00000310_co_png.rf.cef4b6e51c1846b02f5a06a789212d16.jpg  
 extracting: train/00000311_co_png.rf.3b016fdd28a95de9b4814858f6d263e1.jpg  
 extracting: train/00000316_co_png.rf.f1eee848a7ca04fe67ba1a49566bddf2.jpg  
 extracting: train/00000534_co_png.rf.7e10e8d76b52137ef9d47d990536cee1.jpg  
 extracting: train/00000536_co_png.rf.4f53094747facbc11e5f7740fb6d3819.jpg  
 extracting: train/00000537_co_png.rf.2ed5c658caf0031f7c43e1e7a31e2267.jpg  
 extracting: train/00000540_co_png.rf.9ba3b61fa463e58279231def4ce80fa4.jpg  
 extracting: train/00000541_co_png.rf.c4af049b34f5dae7ff3c4fbc22138d15.jpg  
 extracting: train/00000542_co_png.rf.dfc2ca18cd13840dbdcc4be7583111fb.jpg  
 extracting: train/00000543_co_png.rf.3473a6f96a923d1d1a106b32c8a0d4d4.jpg  
 extracting: train/00000544_co_png.rf.8ef793d4aadb9bee1ec9250d28cff02c.jpg  
 extracting: train/00000547_co_png.rf.cd892e14ced55238bf7a8570ad49c6b2.jpg  
 extracting: train/00000549_co_png.rf.53fff3921e14cd7c0a593c6906cfdee5.jpg  
 extracting: train/00000550_co_png.rf.ac5a63b5ced2c7d961bb06a884930418.jpg  
 extracting: train/00000551_co_png.rf.2314b153b556e1e5d12c62869df791f8.jpg  
 extracting: train/00000552_co_png.rf.a8834a205a545bb443ad2f44369108f9.jpg  
 extracting: train/00000553_co_png.rf.57e2994ae96b6c45bafe63979a6b1bdd.jpg  
 extracting: train/00000554_co_png.rf.379eec3504d0343d21d42fc3222af449.jpg  
 extracting: train/00000557_co_png.rf.12a65a5aa10632938ba93f7457237180.jpg  
 extracting: train/00000558_co_png.rf.e62112e71797b65428c44320ad52d663.jpg  
 extracting: train/00000559_co_png.rf.c2d2cdceaeada0e41ca77cc266ffbfde.jpg  
 extracting: train/00000561_co_png.rf.c0e100771651f5fb2d43e2f96a1df3fb.jpg  
 extracting: train/00000562_co_png.rf.d4be02e068051aa45f10df9e111cf0a5.jpg  
 extracting: train/00000563_co_png.rf.64b3e8bbfd738b2b3711e3db08180ea9.jpg  
 extracting: train/00000564_co_png.rf.28bd4e2b837ce01f6a2e059d76bb0944.jpg  
 extracting: train/00000565_co_png.rf.71f7d890fed6f3587911ee1e3975eb68.jpg  
 extracting: train/00000566_co_png.rf.101bcf8e64d7ef2547da136765c2ab1f.jpg  
 extracting: train/00000567_co_png.rf.47ce0b7ffba55aeb8a52ff83bc1e2ae3.jpg  
 extracting: train/00000569_co_png.rf.0c0ace952454065ae7f2c33da50cb728.jpg  
 extracting: train/00000570_co_png.rf.2a502639d1893e4f9ba82f2bcc89a07a.jpg  
 extracting: train/00000573_co_png.rf.7cd099696aaee46fe52329d61254503a.jpg  
 extracting: train/00000575_co_png.rf.55ff258f27c5aec0a7448e6effa7bb9e.jpg  
 extracting: train/00000577_co_png.rf.8ea72469282ffd2f139afc5026a8b4c1.jpg  
 extracting: train/00000578_co_png.rf.a0151c0270f688a092dab819a4ea94f4.jpg  
 extracting: train/00000579_co_png.rf.13a9bf95b4c7339a31d76374c081ab1e.jpg  
 extracting: train/00000580_co_png.rf.3e9df28a074589935c05d896579c9b96.jpg  
 extracting: train/00000585_co_png.rf.de5e53a2e5ea26ed775ce287281108eb.jpg  
 extracting: train/00000586_co_png.rf.6cc1033899d2be17e15b42012fbed22b.jpg  
 extracting: train/00000588_co_png.rf.a30758e75bc433110c1b483a037815b1.jpg  
 extracting: train/00000589_co_png.rf.fba98d43a62b53c8fe05be3893b206a3.jpg  
 extracting: train/00000590_co_png.rf.6dbe7d3aca042b9476156832909bbe28.jpg  
 extracting: train/00000592_co_png.rf.5c8e42d398468e481f135a5b9ef31881.jpg  
 extracting: train/00000593_co_png.rf.6e6555db9e7f9b3ab72d418afe241c6d.jpg  
 extracting: train/00000595_co_png.rf.e9dc5ef39cdeaa0a65af9948e2a67969.jpg  
 extracting: train/00000596_co_png.rf.c71749e1be314ac76cc6f9e15c90c6d8.jpg  
 extracting: train/00000597_co_png.rf.e1fa2a86627da23e6c2ffd01ba7a3553.jpg  
 extracting: train/00000598_co_png.rf.3fdcf67d894c7ed54ac2587952908873.jpg  
 extracting: train/00000599_co_png.rf.e68cdfbd175d412976011ce248b2316a.jpg  
 extracting: train/00000601_co_png.rf.f865ddfe62005641985545d30e495b16.jpg  
 extracting: train/00000602_co_png.rf.d5af4dec6f03c463fb279de3f13100dd.jpg  
 extracting: train/00000603_co_png.rf.500cad664275275b266b4e9e79d3c994.jpg  
 extracting: train/00000604_co_png.rf.91bafe4f00506b8e480ebef2e6bafa1d.jpg  
 extracting: train/00000605_co_png.rf.21779fa081156cfa6e403f25f7617c3e.jpg  
 extracting: train/00000607_co_png.rf.32cc474583d2955647e59ae24be01d6e.jpg  
 extracting: train/00000608_co_png.rf.2085593185e0f2679b8634371cb47236.jpg  
 extracting: train/00000609_co_png.rf.ca824722dea417a4ba6ad85ee17d8317.jpg  
 extracting: train/00000610_co_png.rf.46a0ccb5f0ee434b4ab84dd618f2ce43.jpg  
 extracting: train/00000611_co_png.rf.966d05f76017a690990cfa156d8df28c.jpg  
 extracting: train/00000612_co_png.rf.0c5125390ee327f33a481c3961ebd84a.jpg  
 extracting: train/00000613_co_png.rf.3c5475c2366b6eaeb0b240557455185e.jpg  
 extracting: train/00000614_co_png.rf.71b8ae596c297472e4e18a522afbbc99.jpg  
 extracting: train/00000615_co_png.rf.fd8ab599083cc0d49db5930bad0ce444.jpg  
 extracting: train/00000616_co_png.rf.2c06e199bd590f7ab2ccf5ec260f38ff.jpg  
 extracting: train/00000617_co_png.rf.a66f8027b85afff851c836f636027a6d.jpg  
 extracting: train/00000618_co_png.rf.64867f7f460a825c533bee3d5957d6af.jpg  
 extracting: train/00000619_co_png.rf.a2913ca56a6a805d875c0098fef547aa.jpg  
 extracting: train/00000620_co_png.rf.01e842a62572d946abd3b3b1e33ff006.jpg  
 extracting: train/00000621_co_png.rf.29745456f6c3a4724c1ce9a5ce44ab55.jpg  
 extracting: train/00000622_co_png.rf.5ff4bf4aee1040dace01b11c803b1347.jpg  
 extracting: train/00000623_co_png.rf.b72f15bec8ea2aa558a0db7fb50d4215.jpg  
 extracting: train/00000624_co_png.rf.e2cb7ad67e31d56347a59e5bed264e6d.jpg  
 extracting: train/00000625_co_png.rf.cd602a90557b624df33bae867eb031a7.jpg  
 extracting: train/00000626_co_png.rf.9f3d08394211564b466a144e2a34ead4.jpg  
 extracting: train/00000627_co_png.rf.a32c9c6149a9ff8d667d9f1f50f7d32e.jpg  
 extracting: train/00000628_co_png.rf.18cafc8b20326694e69add6444be4896.jpg  
 extracting: train/00000629_co_png.rf.b26b591c155cc028da4ef88c04d17bbc.jpg  
 extracting: train/00000630_co_png.rf.d9c9e9ddd9bcf68ec2d78bb957eeaa42.jpg  
 extracting: train/00000631_co_png.rf.39d37a44c52e5b236c391b3fa3b59d5f.jpg  
 extracting: train/00000632_co_png.rf.27f96770999e568075affd37058362bb.jpg  
 extracting: train/00000633_co_png.rf.560cbe0e0e609dc09f96dca4dc441b75.jpg  
 extracting: train/00000634_co_png.rf.2ff2ba8264106e05c040506fdb0f89cf.jpg  
 extracting: train/00000635_co_png.rf.26aa8aedb4600c85963ed8eb0b1ff806.jpg  
 extracting: train/00000636_co_png.rf.52beec6ef64d1b4cc8a8c80dba3714b3.jpg  
 extracting: train/00000637_co_png.rf.66281581f6d78a1823e27393c1715898.jpg  
 extracting: train/00000638_co_png.rf.c2ea7883861ce8f67fed04ecc6da6fc1.jpg  
 extracting: train/00000639_co_png.rf.d3111d6e7fb9e91e99542dfe6cd504cb.jpg  
 extracting: train/00000640_co_png.rf.ea61cde8b0d42989661251dc445580b0.jpg  
 extracting: train/00000641_co_png.rf.94c1a1610282127ad214bc692508e3d4.jpg  
 extracting: train/00000642_co_png.rf.d04bad534b8065b224ae10f024104a70.jpg  
 extracting: train/00000643_co_png.rf.301873a155ca88f20c039a9eaf2aa628.jpg  
 extracting: train/00000644_co_png.rf.e5bb4ac5170c35ed81e1ffd795aad026.jpg  
 extracting: train/00000645_co_png.rf.7c68ea8db8d994f49263f4a03b13eae1.jpg  
 extracting: train/00000646_co_png.rf.4d1528ec75bd4f4ac75cd5f2d19c9d2c.jpg  
 extracting: train/00000647_co_png.rf.621b32c73cb3de15ef649ed6482a0315.jpg  
 extracting: train/00000648_co_png.rf.d2eb67676f2ff3b36f05eb6bd5e1f558.jpg  
 extracting: train/00000649_co_png.rf.eb09f29d58f94e4ff5dbd61f2c0eb76c.jpg  
 extracting: train/00000650_co_png.rf.af78027f78d5f4982ac8d2d417926c96.jpg  
 extracting: train/00000651_co_png.rf.cd1bb79604b495528706f9d2a59aab32.jpg  
 extracting: train/00000652_co_png.rf.1a286d3113572b2acbb567cf7de28bc5.jpg  
 extracting: train/00000653_co_png.rf.772a5e851c0dd12c257d17d681016034.jpg  
 extracting: train/00000654_co_png.rf.d2fd5a1eb8bdf35a5f74f1f2d1d33845.jpg  
 extracting: train/00000655_co_png.rf.7911c9e96bf081258f0b8cf201d0c837.jpg  
 extracting: train/00000656_co_png.rf.6e234b6a44a060f72d5c93ca0f4d49d8.jpg  
 extracting: train/00000657_co_png.rf.7848747f48920308c197a3244db47e1b.jpg  
 extracting: train/00000658_co_png.rf.373ea2bb530d049dea863fc6d350bf89.jpg  
 extracting: train/00000659_co_png.rf.5a5e3bdff494e85265e14812c6371733.jpg  
 extracting: train/00000660_co_png.rf.fcf1a5d8dc8b4eaef27fc5e8a35df973.jpg  
 extracting: train/00000661_co_png.rf.4b5fe5062bf59db8944be575d36002b5.jpg  
 extracting: train/00000662_co_png.rf.aa8d3e144261ecdc70c335979e86f9cd.jpg  
 extracting: train/00000663_co_png.rf.79da21ec01ed3eeb774f691d67c96baf.jpg  
 extracting: train/00000664_co_png.rf.b6d76f675fb23f6d5e5c8ab69c175622.jpg  
 extracting: train/00000665_co_png.rf.69eb5ceb42d6e7ef5dc6f7ac2a21daab.jpg  
 extracting: train/00000666_co_png.rf.ad88041adb86086e395cc567d230dd67.jpg  
 extracting: train/00000667_co_png.rf.e0ff0c78e7d95c1d87fe982f53ff66ce.jpg  
 extracting: train/00000668_co_png.rf.80d8a3111f806b8502a74bac23ea9dfa.jpg  
 extracting: train/00000669_co_png.rf.1641ff92d29e2184c646860910b93a4f.jpg  
 extracting: train/00000670_co_png.rf.179e2c96647ac7548fc74afa6224cc93.jpg  
 extracting: train/00000671_co_png.rf.fd0c52df26c7eca2acc12d079a14fb3d.jpg  
 extracting: train/00000672_co_png.rf.acaf0cbd16b2ccee89fe851ef00c5c1e.jpg  
 extracting: train/00000673_co_png.rf.9993a1259c10c1a6863cd0b94dbab0e5.jpg  
 extracting: train/00000674_co_png.rf.f141beb272ba06a3b646df253a9b145d.jpg  
 extracting: train/00000675_co_png.rf.92704b802a0b439eb1c96d00b11b3ed9.jpg  
 extracting: train/00000676_co_png.rf.78d81b6b09a7be07a397af9229ba5e93.jpg  
 extracting: train/00000677_co_png.rf.b2e69ab428bf098664677b6a85a0884c.jpg  
 extracting: train/00000678_co_png.rf.e00ed743335d660882708ce613b52db5.jpg  
 extracting: train/00000679_co_png.rf.7d70b103821b4142e43f52f653950e3b.jpg  
 extracting: train/00000680_co_png.rf.45ded170ecfc327de476036027811614.jpg  
 extracting: train/00000681_co_png.rf.48a89166a2a7b441bbb9918278b250d6.jpg  
 extracting: train/00000682_co_png.rf.a7a4ba92849ad4b7c5ed0a4b29a43b27.jpg  
 extracting: train/00000683_co_png.rf.8636ea9d0abd45a72c03d77a135cced0.jpg  
 extracting: train/00000684_co_png.rf.76c6eabf7f74b76366bc439f38a49319.jpg  
 extracting: train/00000685_co_png.rf.0a00a521e80ef8211dbc81b0f49178fb.jpg  
 extracting: train/00000686_co_png.rf.d655e6558e9d50d696748a6f76b12380.jpg  
 extracting: train/00000687_co_png.rf.7e21cfd09d03cea3b21751479007ec88.jpg  
 extracting: train/00000688_co_png.rf.fc4b839f9e739ef9e038614b1123b58e.jpg  
 extracting: train/_annotations.coco.json  
   creating: valid/
 extracting: valid/00000535_co_png.rf.4a5addf3a3ede9f3ae9f2c1c590046d7.jpg  
 extracting: valid/00000545_co_png.rf.b9644c032c02a8d17496cf4ff0f843c9.jpg  
 extracting: valid/00000546_co_png.rf.c08d5342b76d6c860e7996122ef970e5.jpg  
 extracting: valid/00000556_co_png.rf.591bb9fa52d3509e579e9c8ab25b9020.jpg  
 extracting: valid/00000571_co_png.rf.0393008d677d2f4f845ef360e8b5385c.jpg  
 extracting: valid/00000572_co_png.rf.d5873b9e22a29a46e2fb1915aea34566.jpg  
 extracting: valid/00000574_co_png.rf.f178af6d042313237b141123fe6b5efc.jpg  
 extracting: valid/00000576_co_png.rf.06bf1d628e9e5edee8ff0b3230a10a41.jpg  
 extracting: valid/00000583_co_png.rf.9b65de842574bcc4b2b2084b5df100a3.jpg  
 extracting: valid/00000591_co_png.rf.19fd54775f29e93d3f6df0553668d778.jpg  
 extracting: valid/_annotations.coco.json  

Register & Visualize the dataset

In [ ]:
#Register my custom dataset: train - 150, valid - 10, test - 40
register_coco_instances("my_dataset_train", {}, "/content/train/_annotations.coco.json", "/content/train")
register_coco_instances("my_dataset_val", {}, "/content/valid/_annotations.coco.json", "/content/valid")
register_coco_instances("my_dataset_test", {}, "/content/test/_annotations.coco.json", "/content/test")
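For readers unfamiliar with the COCO format, here is a minimal sketch of the `_annotations.coco.json` structure that `register_coco_instances` consumes (the file name, ids, and box values below are hypothetical). Boxes are stored as `[x, y, width, height]` in pixels, and Roboflow exports include a superclass category alongside the real class:

```python
import json

# A minimal, hypothetical COCO annotation file: one image, one car box.
coco = {
    "categories": [{"id": 0, "name": "cars", "supercategory": "none"},
                   {"id": 1, "name": "car", "supercategory": "cars"}],
    "images": [{"id": 0, "file_name": "example_tile.jpg",
                "width": 512, "height": 512}],
    "annotations": [{"id": 0, "image_id": 0, "category_id": 1,
                     "bbox": [120, 84, 32, 18],  # [x, y, width, height] in pixels
                     "area": 32 * 18, "iscrowd": 0}],
}

# Round-trip through JSON, as the registered loader would read it from disk.
loaded = json.loads(json.dumps(coco))
print(len(loaded["images"]), len(loaded["annotations"]))
```

Note the category ids start at 0, which is exactly why Detectron2 later logs "Category ids in annotations are not in [1, #categories]" and remaps them.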
In [ ]:
# Visualize a few random training samples with their ground-truth boxes
import random
import cv2
from google.colab.patches import cv2_imshow
from detectron2.data import DatasetCatalog, MetadataCatalog
from detectron2.utils.visualizer import Visualizer

my_dataset_train_metadata = MetadataCatalog.get("my_dataset_train")
dataset_dicts = DatasetCatalog.get("my_dataset_train")

for d in random.sample(dataset_dicts, 3):
    img = cv2.imread(d["file_name"])
    # OpenCV loads BGR; Visualizer expects RGB, so reverse the channel axis
    visualizer = Visualizer(img[:, :, ::-1], metadata=my_dataset_train_metadata, scale=0.5)
    vis = visualizer.draw_dataset_dict(d)
    cv2_imshow(vis.get_image()[:, :, ::-1])  # flip back to BGR for display
WARNING [09/28 16:59:14 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[09/28 16:59:14 d2.data.datasets.coco]: Loaded 150 images in COCO format from /content/train/_annotations.coco.json

Train my custom detector

In [ ]:
# We subclass DefaultTrainer so that COCO evaluation runs on the validation
# set during training; DefaultTrainer alone performs no validation evaluation.
import os
from detectron2.engine import DefaultTrainer
from detectron2.evaluation import COCOEvaluator

class CocoTrainer(DefaultTrainer):

    @classmethod
    def build_evaluator(cls, cfg, dataset_name, output_folder=None):
        if output_folder is None:
            os.makedirs("coco_eval", exist_ok=True)
            output_folder = "coco_eval"
        return COCOEvaluator(dataset_name, cfg, False, output_folder)
In [ ]:
from detectron2 import model_zoo
from detectron2.config import get_cfg

cfg = get_cfg()  # base CfgNode
cfg.merge_from_file(model_zoo.get_config_file("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml"))  # load the model configuration
cfg.DATASETS.TRAIN = ("my_dataset_train",)  # training dataset
cfg.DATASETS.TEST = ("my_dataset_val",)  # validation dataset
cfg.DATALOADER.NUM_WORKERS = 4  # data-loading workers (often set to the number of CPU cores)
cfg.MODEL.WEIGHTS = model_zoo.get_checkpoint_url("COCO-Detection/faster_rcnn_X_101_32x8d_FPN_3x.yaml")  # initialize from model-zoo weights
cfg.SOLVER.IMS_PER_BATCH = 4  # images per training batch
cfg.SOLVER.BASE_LR = 0.001  # base learning rate
cfg.SOLVER.WARMUP_ITERS = 1000  # linear LR warm-up over the first 1000 iterations
cfg.SOLVER.MAX_ITER = 1000  # total training iterations (not epochs)
cfg.SOLVER.STEPS = (1000, 1500)  # iterations at which LR is scaled by GAMMA (>= MAX_ITER here, so never reached)
cfg.SOLVER.GAMMA = 0.05
cfg.MODEL.ROI_HEADS.BATCH_SIZE_PER_IMAGE = 64  # ROI proposals sampled per image
cfg.MODEL.ROI_HEADS.NUM_CLASSES = 2  # categories in the Roboflow COCO export (superclass + car)
cfg.TEST.EVAL_PERIOD = 500  # run COCO evaluation every 500 iterations

os.makedirs(cfg.OUTPUT_DIR, exist_ok=True)
trainer = CocoTrainer(cfg) # create the trainer
trainer.resume_or_load(resume=False)
trainer.train() # train
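One common point of confusion: `MAX_ITER` counts iterations (batches), not epochs. With the solver values above and the 146 usable training images the data loader reports, the effective epoch count and the warm-up behaviour can be sanity-checked with plain Python. This is a rough sketch that only approximates Detectron2's `WarmupMultiStepLR`, not the library's exact implementation:

```python
# Rough sanity check of the solver settings above (plain Python, no Detectron2).
IMS_PER_BATCH = 4
MAX_ITER = 1000
BASE_LR = 0.001
WARMUP_ITERS = 1000
STEPS = (1000, 1500)
GAMMA = 0.05
NUM_TRAIN_IMAGES = 146  # usable images reported by the data loader

# Each iteration consumes one batch of IMS_PER_BATCH images, so:
epochs = MAX_ITER * IMS_PER_BATCH / NUM_TRAIN_IMAGES
print(f"~{epochs:.1f} passes over the training set")

def lr_at(it):
    """Approximate the scheduler: linear warm-up, then BASE_LR scaled
    by GAMMA once per milestone in STEPS that has been passed."""
    warmup = min(1.0, (it + 1) / WARMUP_ITERS)        # linear warm-up factor
    decay = GAMMA ** sum(1 for s in STEPS if it >= s)  # step decay
    return BASE_LR * warmup * decay

# Because WARMUP_ITERS and STEPS[0] both equal MAX_ITER, training spends
# its entire run warming up and the GAMMA decay steps never fire.
print(lr_at(19), lr_at(499), lr_at(999))
```

The value at iteration 19 (`0.00002`) matches the `lr: 0.000020` line in the training log below, which confirms the run never leaves the warm-up phase.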
[09/04 18:28:52 d2.engine.defaults]: Model:
GeneralizedRCNN(
  (backbone): FPN(
    (fpn_lateral2): Conv2d(256, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output2): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral3): Conv2d(512, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output3): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral4): Conv2d(1024, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output4): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (fpn_lateral5): Conv2d(2048, 256, kernel_size=(1, 1), stride=(1, 1))
    (fpn_output5): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
    (top_block): LastLevelMaxPool()
    (bottom_up): ResNet(
      (stem): BasicStem(
        (conv1): Conv2d(
          3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False
          (norm): FrozenBatchNorm2d(num_features=64, eps=1e-05)
        )
      )
      (res2): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv1): Conv2d(
            64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv2): Conv2d(
            256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
          (conv3): Conv2d(
            256, 256, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=256, eps=1e-05)
          )
        )
      )
      (res3): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            256, 512, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv1): Conv2d(
            256, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv2): Conv2d(
            512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
          (conv3): Conv2d(
            512, 512, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=512, eps=1e-05)
          )
        )
      )
      (res4): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            512, 1024, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv1): Conv2d(
            512, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (3): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (4): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (5): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (6): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (7): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (8): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (9): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (10): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (11): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (12): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (13): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (14): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (15): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (16): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (17): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (18): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (19): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (20): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (21): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
        (22): BottleneckBlock(
          (conv1): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv2): Conv2d(
            1024, 1024, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
          (conv3): Conv2d(
            1024, 1024, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=1024, eps=1e-05)
          )
        )
      )
      (res5): Sequential(
        (0): BottleneckBlock(
          (shortcut): Conv2d(
            1024, 2048, kernel_size=(1, 1), stride=(2, 2), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv1): Conv2d(
            1024, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv2): Conv2d(
            2048, 2048, kernel_size=(3, 3), stride=(2, 2), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv3): Conv2d(
            2048, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (1): BottleneckBlock(
          (conv1): Conv2d(
            2048, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv2): Conv2d(
            2048, 2048, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv3): Conv2d(
            2048, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
        (2): BottleneckBlock(
          (conv1): Conv2d(
            2048, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv2): Conv2d(
            2048, 2048, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), groups=32, bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
          (conv3): Conv2d(
            2048, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False
            (norm): FrozenBatchNorm2d(num_features=2048, eps=1e-05)
          )
        )
      )
    )
  )
  (proposal_generator): RPN(
    (anchor_generator): DefaultAnchorGenerator(
      (cell_anchors): BufferList()
    )
    (rpn_head): StandardRPNHead(
      (conv): Conv2d(256, 256, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1))
      (objectness_logits): Conv2d(256, 3, kernel_size=(1, 1), stride=(1, 1))
      (anchor_deltas): Conv2d(256, 12, kernel_size=(1, 1), stride=(1, 1))
    )
  )
  (roi_heads): StandardROIHeads(
    (box_pooler): ROIPooler(
      (level_poolers): ModuleList(
        (0): ROIAlign(output_size=(7, 7), spatial_scale=0.25, sampling_ratio=0, aligned=True)
        (1): ROIAlign(output_size=(7, 7), spatial_scale=0.125, sampling_ratio=0, aligned=True)
        (2): ROIAlign(output_size=(7, 7), spatial_scale=0.0625, sampling_ratio=0, aligned=True)
        (3): ROIAlign(output_size=(7, 7), spatial_scale=0.03125, sampling_ratio=0, aligned=True)
      )
    )
    (box_head): FastRCNNConvFCHead(
      (fc1): Linear(in_features=12544, out_features=1024, bias=True)
      (fc2): Linear(in_features=1024, out_features=1024, bias=True)
    )
    (box_predictor): FastRCNNOutputLayers(
      (cls_score): Linear(in_features=1024, out_features=3, bias=True)
      (bbox_pred): Linear(in_features=1024, out_features=8, bias=True)
    )
  )
)
WARNING [09/04 18:28:52 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[09/04 18:28:52 d2.data.datasets.coco]: Loaded 150 images in COCO format from /content/train/_annotations.coco.json
[09/04 18:28:52 d2.data.build]: Removed 4 images with no usable annotations. 146 images left.
[09/04 18:28:52 d2.data.common]: Serializing 146 elements to byte tensors and concatenating them all ...
[09/04 18:28:52 d2.data.common]: Serialized dataset takes 0.05 MiB
[09/04 18:28:52 d2.data.detection_utils]: TransformGens used in training: [ResizeShortestEdge(short_edge_length=(640, 672, 704, 736, 768, 800), max_size=1333, sample_style='choice'), RandomFlip()]
[09/04 18:28:52 d2.data.build]: Using training sampler TrainingSampler
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.weight' to the model due to incompatible shapes: (81, 1024) in the checkpoint but (3, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.cls_score.bias' to the model due to incompatible shapes: (81,) in the checkpoint but (3,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.weight' to the model due to incompatible shapes: (320, 1024) in the checkpoint but (8, 1024) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Skip loading parameter 'roi_heads.box_predictor.bbox_pred.bias' to the model due to incompatible shapes: (320,) in the checkpoint but (8,) in the model! You might want to double check if this is expected.
WARNING:fvcore.common.checkpoint:Some model parameters or buffers are not found in the checkpoint:
roi_heads.box_predictor.bbox_pred.{bias, weight}
roi_heads.box_predictor.cls_score.{bias, weight}
[09/04 18:28:52 d2.engine.train_loop]: Starting training from iteration 0
[09/04 18:29:32 d2.utils.events]:  eta: 0:33:35  iter: 19  total_loss: 1.718  loss_cls: 0.942  loss_box_reg: 0.466  loss_rpn_cls: 0.200  loss_rpn_loc: 0.021  time: 2.0451  data_time: 0.0200  lr: 0.000020  max_mem: 8426M
[09/04 18:30:16 d2.utils.events]:  eta: 0:34:11  iter: 39  total_loss: 1.505  loss_cls: 0.748  loss_box_reg: 0.518  loss_rpn_cls: 0.161  loss_rpn_loc: 0.015  time: 2.1252  data_time: 0.0093  lr: 0.000040  max_mem: 8426M
[09/04 18:30:59 d2.utils.events]:  eta: 0:33:38  iter: 59  total_loss: 1.408  loss_cls: 0.596  loss_box_reg: 0.647  loss_rpn_cls: 0.094  loss_rpn_loc: 0.018  time: 2.1330  data_time: 0.0102  lr: 0.000060  max_mem: 8426M
[09/04 18:31:45 d2.utils.events]:  eta: 0:33:16  iter: 79  total_loss: 1.323  loss_cls: 0.509  loss_box_reg: 0.731  loss_rpn_cls: 0.025  loss_rpn_loc: 0.017  time: 2.1658  data_time: 0.0092  lr: 0.000080  max_mem: 8426M
[09/04 18:32:30 d2.utils.events]:  eta: 0:32:46  iter: 99  total_loss: 1.216  loss_cls: 0.457  loss_box_reg: 0.715  loss_rpn_cls: 0.032  loss_rpn_loc: 0.017  time: 2.1820  data_time: 0.0087  lr: 0.000100  max_mem: 8426M
[09/04 18:33:16 d2.utils.events]:  eta: 0:32:27  iter: 119  total_loss: 1.156  loss_cls: 0.394  loss_box_reg: 0.739  loss_rpn_cls: 0.021  loss_rpn_loc: 0.015  time: 2.2022  data_time: 0.0100  lr: 0.000120  max_mem: 8426M
[09/04 18:34:00 d2.utils.events]:  eta: 0:31:55  iter: 139  total_loss: 1.143  loss_cls: 0.344  loss_box_reg: 0.719  loss_rpn_cls: 0.022  loss_rpn_loc: 0.018  time: 2.2077  data_time: 0.0094  lr: 0.000140  max_mem: 8426M
[09/04 18:34:45 d2.utils.events]:  eta: 0:31:12  iter: 159  total_loss: 1.059  loss_cls: 0.292  loss_box_reg: 0.728  loss_rpn_cls: 0.013  loss_rpn_loc: 0.016  time: 2.2136  data_time: 0.0090  lr: 0.000160  max_mem: 8426M
[09/04 18:35:29 d2.utils.events]:  eta: 0:30:23  iter: 179  total_loss: 1.027  loss_cls: 0.268  loss_box_reg: 0.716  loss_rpn_cls: 0.019  loss_rpn_loc: 0.017  time: 2.2094  data_time: 0.0101  lr: 0.000180  max_mem: 8426M
[09/04 18:36:14 d2.utils.events]:  eta: 0:29:43  iter: 199  total_loss: 0.897  loss_cls: 0.212  loss_box_reg: 0.667  loss_rpn_cls: 0.012  loss_rpn_loc: 0.014  time: 2.2147  data_time: 0.0103  lr: 0.000200  max_mem: 8426M
[09/04 18:37:00 d2.utils.events]:  eta: 0:29:02  iter: 219  total_loss: 0.942  loss_cls: 0.249  loss_box_reg: 0.642  loss_rpn_cls: 0.014  loss_rpn_loc: 0.016  time: 2.2214  data_time: 0.0090  lr: 0.000220  max_mem: 8426M
[09/04 18:37:45 d2.utils.events]:  eta: 0:28:16  iter: 239  total_loss: 0.809  loss_cls: 0.203  loss_box_reg: 0.585  loss_rpn_cls: 0.011  loss_rpn_loc: 0.015  time: 2.2236  data_time: 0.0088  lr: 0.000240  max_mem: 8426M
[09/04 18:38:30 d2.utils.events]:  eta: 0:27:31  iter: 259  total_loss: 0.755  loss_cls: 0.188  loss_box_reg: 0.512  loss_rpn_cls: 0.011  loss_rpn_loc: 0.017  time: 2.2241  data_time: 0.0093  lr: 0.000260  max_mem: 8426M
[09/04 18:39:13 d2.utils.events]:  eta: 0:26:44  iter: 279  total_loss: 0.733  loss_cls: 0.157  loss_box_reg: 0.541  loss_rpn_cls: 0.006  loss_rpn_loc: 0.015  time: 2.2206  data_time: 0.0094  lr: 0.000280  max_mem: 8426M
[09/04 18:39:59 d2.utils.events]:  eta: 0:26:03  iter: 299  total_loss: 0.761  loss_cls: 0.184  loss_box_reg: 0.543  loss_rpn_cls: 0.008  loss_rpn_loc: 0.019  time: 2.2264  data_time: 0.0096  lr: 0.000300  max_mem: 8426M
[09/04 18:40:46 d2.utils.events]:  eta: 0:25:22  iter: 319  total_loss: 0.652  loss_cls: 0.129  loss_box_reg: 0.459  loss_rpn_cls: 0.005  loss_rpn_loc: 0.009  time: 2.2320  data_time: 0.0096  lr: 0.000320  max_mem: 8426M
[09/04 18:41:31 d2.utils.events]:  eta: 0:24:39  iter: 339  total_loss: 0.672  loss_cls: 0.163  loss_box_reg: 0.479  loss_rpn_cls: 0.006  loss_rpn_loc: 0.015  time: 2.2336  data_time: 0.0094  lr: 0.000340  max_mem: 8426M
[09/04 18:42:15 d2.utils.events]:  eta: 0:23:53  iter: 359  total_loss: 0.667  loss_cls: 0.130  loss_box_reg: 0.506  loss_rpn_cls: 0.004  loss_rpn_loc: 0.015  time: 2.2331  data_time: 0.0090  lr: 0.000360  max_mem: 8426M
[09/04 18:43:02 d2.utils.events]:  eta: 0:23:11  iter: 379  total_loss: 0.571  loss_cls: 0.105  loss_box_reg: 0.434  loss_rpn_cls: 0.004  loss_rpn_loc: 0.009  time: 2.2396  data_time: 0.0090  lr: 0.000380  max_mem: 8426M
[09/04 18:43:48 d2.utils.events]:  eta: 0:22:27  iter: 399  total_loss: 0.635  loss_cls: 0.120  loss_box_reg: 0.494  loss_rpn_cls: 0.005  loss_rpn_loc: 0.013  time: 2.2410  data_time: 0.0094  lr: 0.000400  max_mem: 8426M
[09/04 18:44:33 d2.utils.events]:  eta: 0:21:43  iter: 419  total_loss: 0.591  loss_cls: 0.113  loss_box_reg: 0.446  loss_rpn_cls: 0.003  loss_rpn_loc: 0.015  time: 2.2411  data_time: 0.0091  lr: 0.000420  max_mem: 8426M
[09/04 18:45:17 d2.utils.events]:  eta: 0:20:58  iter: 439  total_loss: 0.546  loss_cls: 0.126  loss_box_reg: 0.413  loss_rpn_cls: 0.003  loss_rpn_loc: 0.011  time: 2.2405  data_time: 0.0090  lr: 0.000440  max_mem: 8426M
[09/04 18:46:03 d2.utils.events]:  eta: 0:20:14  iter: 459  total_loss: 0.552  loss_cls: 0.103  loss_box_reg: 0.422  loss_rpn_cls: 0.004  loss_rpn_loc: 0.013  time: 2.2423  data_time: 0.0104  lr: 0.000460  max_mem: 8426M
[09/04 18:46:48 d2.utils.events]:  eta: 0:19:29  iter: 479  total_loss: 0.540  loss_cls: 0.097  loss_box_reg: 0.408  loss_rpn_cls: 0.002  loss_rpn_loc: 0.010  time: 2.2423  data_time: 0.0090  lr: 0.000480  max_mem: 8426M
WARNING [09/04 18:47:32 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[09/04 18:47:32 d2.data.datasets.coco]: Loaded 10 images in COCO format from /content/valid/_annotations.coco.json
[09/04 18:47:32 d2.data.common]: Serializing 10 elements to byte tensors and concatenating them all ...
[09/04 18:47:32 d2.data.common]: Serialized dataset takes 0.00 MiB
[09/04 18:47:32 d2.evaluation.evaluator]: Start inference on 10 images
[09/04 18:47:35 d2.evaluation.evaluator]: Total inference time: 0:00:01.089128 (0.217826 s / img per device, on 1 devices)
[09/04 18:47:35 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:01 (0.201924 s / img per device, on 1 devices)
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Saving results to coco_eval/coco_instances_results.json
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Evaluating predictions ...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.01s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.413
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.640
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.486
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.414
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.414
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.471
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.471
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.471
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm  |  APl  |
|:------:|:------:|:------:|:------:|:-----:|:-----:|
| 41.276 | 63.976 | 48.639 | 41.355 |  nan  |  nan  |
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Note that some metrics cannot be computed.
[09/04 18:47:35 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category   | AP   | category   | AP     |
|:-----------|:-----|:-----------|:-------|
| Car        | nan  | Car        | 41.276 |
[09/04 18:47:35 d2.engine.defaults]: Evaluation results for my_dataset_val in csv format:
[09/04 18:47:35 d2.evaluation.testing]: copypaste: Task: bbox
[09/04 18:47:35 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[09/04 18:47:35 d2.evaluation.testing]: copypaste: 41.2757,63.9764,48.6386,41.3549,nan,nan
[09/04 18:47:35 d2.utils.events]:  eta: 0:18:43  iter: 499  total_loss: 0.523  loss_cls: 0.102  loss_box_reg: 0.409  loss_rpn_cls: 0.003  loss_rpn_loc: 0.013  time: 2.2406  data_time: 0.0090  lr: 0.000500  max_mem: 8426M
[09/04 18:48:19 d2.utils.events]:  eta: 0:17:58  iter: 519  total_loss: 0.508  loss_cls: 0.109  loss_box_reg: 0.393  loss_rpn_cls: 0.002  loss_rpn_loc: 0.010  time: 2.2404  data_time: 0.0093  lr: 0.000519  max_mem: 8426M
[09/04 18:49:05 d2.utils.events]:  eta: 0:17:14  iter: 539  total_loss: 0.523  loss_cls: 0.110  loss_box_reg: 0.366  loss_rpn_cls: 0.003  loss_rpn_loc: 0.011  time: 2.2414  data_time: 0.0089  lr: 0.000539  max_mem: 8426M
[09/04 18:49:50 d2.utils.events]:  eta: 0:16:30  iter: 559  total_loss: 0.504  loss_cls: 0.109  loss_box_reg: 0.373  loss_rpn_cls: 0.002  loss_rpn_loc: 0.011  time: 2.2420  data_time: 0.0094  lr: 0.000559  max_mem: 8426M
[09/04 18:50:35 d2.utils.events]:  eta: 0:15:45  iter: 579  total_loss: 0.520  loss_cls: 0.102  loss_box_reg: 0.399  loss_rpn_cls: 0.001  loss_rpn_loc: 0.011  time: 2.2429  data_time: 0.0089  lr: 0.000579  max_mem: 8426M
[09/04 18:51:22 d2.utils.events]:  eta: 0:15:00  iter: 599  total_loss: 0.456  loss_cls: 0.091  loss_box_reg: 0.371  loss_rpn_cls: 0.001  loss_rpn_loc: 0.008  time: 2.2454  data_time: 0.0092  lr: 0.000599  max_mem: 8426M
[09/04 18:52:09 d2.utils.events]:  eta: 0:14:16  iter: 619  total_loss: 0.491  loss_cls: 0.095  loss_box_reg: 0.381  loss_rpn_cls: 0.002  loss_rpn_loc: 0.012  time: 2.2490  data_time: 0.0088  lr: 0.000619  max_mem: 8426M
[09/04 18:52:54 d2.utils.events]:  eta: 0:13:31  iter: 639  total_loss: 0.459  loss_cls: 0.091  loss_box_reg: 0.341  loss_rpn_cls: 0.001  loss_rpn_loc: 0.010  time: 2.2489  data_time: 0.0095  lr: 0.000639  max_mem: 8426M
[09/04 18:53:38 d2.utils.events]:  eta: 0:12:46  iter: 659  total_loss: 0.468  loss_cls: 0.087  loss_box_reg: 0.370  loss_rpn_cls: 0.002  loss_rpn_loc: 0.012  time: 2.2480  data_time: 0.0088  lr: 0.000659  max_mem: 8426M
[09/04 18:54:25 d2.utils.events]:  eta: 0:12:03  iter: 679  total_loss: 0.483  loss_cls: 0.084  loss_box_reg: 0.367  loss_rpn_cls: 0.001  loss_rpn_loc: 0.011  time: 2.2502  data_time: 0.0098  lr: 0.000679  max_mem: 8426M
[09/04 18:55:09 d2.utils.events]:  eta: 0:11:17  iter: 699  total_loss: 0.402  loss_cls: 0.073  loss_box_reg: 0.314  loss_rpn_cls: 0.001  loss_rpn_loc: 0.010  time: 2.2492  data_time: 0.0091  lr: 0.000699  max_mem: 8426M
[09/04 18:55:55 d2.utils.events]:  eta: 0:10:33  iter: 719  total_loss: 0.416  loss_cls: 0.085  loss_box_reg: 0.307  loss_rpn_cls: 0.001  loss_rpn_loc: 0.010  time: 2.2513  data_time: 0.0098  lr: 0.000719  max_mem: 8426M
[09/04 18:56:41 d2.utils.events]:  eta: 0:09:48  iter: 739  total_loss: 0.425  loss_cls: 0.069  loss_box_reg: 0.336  loss_rpn_cls: 0.001  loss_rpn_loc: 0.010  time: 2.2526  data_time: 0.0087  lr: 0.000739  max_mem: 8426M
[09/04 18:57:27 d2.utils.events]:  eta: 0:09:03  iter: 759  total_loss: 0.457  loss_cls: 0.087  loss_box_reg: 0.355  loss_rpn_cls: 0.001  loss_rpn_loc: 0.012  time: 2.2533  data_time: 0.0086  lr: 0.000759  max_mem: 8426M
[09/04 18:58:11 d2.utils.events]:  eta: 0:08:18  iter: 779  total_loss: 0.401  loss_cls: 0.070  loss_box_reg: 0.304  loss_rpn_cls: 0.001  loss_rpn_loc: 0.007  time: 2.2524  data_time: 0.0094  lr: 0.000779  max_mem: 8426M
[09/04 18:58:56 d2.utils.events]:  eta: 0:07:33  iter: 799  total_loss: 0.394  loss_cls: 0.068  loss_box_reg: 0.311  loss_rpn_cls: 0.001  loss_rpn_loc: 0.011  time: 2.2514  data_time: 0.0092  lr: 0.000799  max_mem: 8426M
[09/04 18:59:40 d2.utils.events]:  eta: 0:06:47  iter: 819  total_loss: 0.401  loss_cls: 0.065  loss_box_reg: 0.322  loss_rpn_cls: 0.000  loss_rpn_loc: 0.009  time: 2.2504  data_time: 0.0085  lr: 0.000819  max_mem: 8426M
[09/04 19:00:26 d2.utils.events]:  eta: 0:06:02  iter: 839  total_loss: 0.397  loss_cls: 0.071  loss_box_reg: 0.317  loss_rpn_cls: 0.001  loss_rpn_loc: 0.010  time: 2.2513  data_time: 0.0093  lr: 0.000839  max_mem: 8426M
[09/04 19:01:11 d2.utils.events]:  eta: 0:05:17  iter: 859  total_loss: 0.350  loss_cls: 0.062  loss_box_reg: 0.276  loss_rpn_cls: 0.001  loss_rpn_loc: 0.009  time: 2.2515  data_time: 0.0087  lr: 0.000859  max_mem: 8426M
[09/04 19:01:57 d2.utils.events]:  eta: 0:04:32  iter: 879  total_loss: 0.392  loss_cls: 0.059  loss_box_reg: 0.316  loss_rpn_cls: 0.001  loss_rpn_loc: 0.009  time: 2.2527  data_time: 0.0093  lr: 0.000879  max_mem: 8426M
[09/04 19:02:40 d2.utils.events]:  eta: 0:03:47  iter: 899  total_loss: 0.393  loss_cls: 0.061  loss_box_reg: 0.321  loss_rpn_cls: 0.000  loss_rpn_loc: 0.010  time: 2.2509  data_time: 0.0087  lr: 0.000899  max_mem: 8426M
[09/04 19:03:25 d2.utils.events]:  eta: 0:03:02  iter: 919  total_loss: 0.361  loss_cls: 0.063  loss_box_reg: 0.287  loss_rpn_cls: 0.001  loss_rpn_loc: 0.009  time: 2.2504  data_time: 0.0085  lr: 0.000919  max_mem: 8426M
[09/04 19:04:10 d2.utils.events]:  eta: 0:02:17  iter: 939  total_loss: 0.367  loss_cls: 0.068  loss_box_reg: 0.278  loss_rpn_cls: 0.000  loss_rpn_loc: 0.010  time: 2.2507  data_time: 0.0095  lr: 0.000939  max_mem: 8426M
[09/04 19:04:56 d2.utils.events]:  eta: 0:01:32  iter: 959  total_loss: 0.385  loss_cls: 0.059  loss_box_reg: 0.298  loss_rpn_cls: 0.001  loss_rpn_loc: 0.009  time: 2.2510  data_time: 0.0091  lr: 0.000959  max_mem: 8426M
[09/04 19:05:41 d2.utils.events]:  eta: 0:00:47  iter: 979  total_loss: 0.349  loss_cls: 0.063  loss_box_reg: 0.283  loss_rpn_cls: 0.000  loss_rpn_loc: 0.009  time: 2.2511  data_time: 0.0090  lr: 0.000979  max_mem: 8426M
WARNING [09/04 19:06:27 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[09/04 19:06:27 d2.data.datasets.coco]: Loaded 10 images in COCO format from /content/valid/_annotations.coco.json
[09/04 19:06:27 d2.data.common]: Serializing 10 elements to byte tensors and concatenating them all ...
[09/04 19:06:27 d2.data.common]: Serialized dataset takes 0.00 MiB
[09/04 19:06:27 d2.evaluation.evaluator]: Start inference on 10 images
[09/04 19:06:29 d2.evaluation.evaluator]: Total inference time: 0:00:01.069654 (0.213931 s / img per device, on 1 devices)
[09/04 19:06:29 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:00 (0.198577 s / img per device, on 1 devices)
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Saving results to coco_eval/coco_instances_results.json
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Evaluating predictions ...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.01s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.448
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.723
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.508
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.448
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.407
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.493
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.493
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.493
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm  |  APl  |
|:------:|:------:|:------:|:------:|:-----:|:-----:|
| 44.818 | 72.343 | 50.825 | 44.818 |  nan  |  nan  |
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Note that some metrics cannot be computed.
[09/04 19:06:29 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category   | AP   | category   | AP     |
|:-----------|:-----|:-----------|:-------|
| Car        | nan  | Car        | 44.818 |
[09/04 19:06:29 d2.engine.defaults]: Evaluation results for my_dataset_val in csv format:
[09/04 19:06:29 d2.evaluation.testing]: copypaste: Task: bbox
[09/04 19:06:29 d2.evaluation.testing]: copypaste: AP,AP50,AP75,APs,APm,APl
[09/04 19:06:29 d2.evaluation.testing]: copypaste: 44.8185,72.3432,50.8251,44.8185,nan,nan
[09/04 19:06:29 d2.utils.events]:  eta: 0:00:02  iter: 999  total_loss: 0.361  loss_cls: 0.054  loss_box_reg: 0.304  loss_rpn_cls: 0.001  loss_rpn_loc: 0.008  time: 2.2494  data_time: 0.0085  lr: 0.000999  max_mem: 8426M
[09/04 19:06:29 d2.engine.hooks]: Overall training speed: 997 iterations in 0:37:24 (2.2516 s / it)
[09/04 19:06:29 d2.engine.hooks]: Total training time: 0:37:33 (0:00:08 on hooks)
In [ ]:
# Look at training curves in tensorboard:
%load_ext tensorboard
%tensorboard --logdir output
The tensorboard extension is already loaded. To reload it, use:
  %reload_ext tensorboard

Evaluation

In [ ]:
# Evaluate the trained model on the test split
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.85 # set the test threshold
predictor = DefaultPredictor(cfg)
evaluator = COCOEvaluator("my_dataset_test", cfg, False, output_dir="./output/")
val_loader = build_detection_test_loader(cfg, "my_dataset_test")
inference_on_dataset(trainer.model, val_loader, evaluator)
WARNING [09/04 19:08:05 d2.data.datasets.coco]: 
Category ids in annotations are not in [1, #categories]! We'll apply a mapping for you.

[09/04 19:08:05 d2.data.datasets.coco]: Loaded 40 images in COCO format from /content/test/_annotations.coco.json
[09/04 19:08:05 d2.data.common]: Serializing 40 elements to byte tensors and concatenating them all ...
[09/04 19:08:05 d2.data.common]: Serialized dataset takes 0.01 MiB
[09/04 19:08:05 d2.evaluation.evaluator]: Start inference on 40 images
[09/04 19:08:07 d2.evaluation.evaluator]: Inference done 11/40. 0.1975 s / img. ETA=0:00:05
[09/04 19:08:13 d2.evaluation.evaluator]: Inference done 36/40. 0.1998 s / img. ETA=0:00:00
[09/04 19:08:13 d2.evaluation.evaluator]: Total inference time: 0:00:07.126319 (0.203609 s / img per device, on 1 devices)
[09/04 19:08:13 d2.evaluation.evaluator]: Total inference pure compute time: 0:00:06 (0.199904 s / img per device, on 1 devices)
[09/04 19:08:13 d2.evaluation.coco_evaluation]: Preparing results for COCO format ...
[09/04 19:08:13 d2.evaluation.coco_evaluation]: Saving results to ./output/coco_instances_results.json
[09/04 19:08:13 d2.evaluation.coco_evaluation]: Evaluating predictions ...
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.06s).
Accumulating evaluation results...
DONE (t=0.01s).
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.326
 Average Precision  (AP) @[ IoU=0.50      | area=   all | maxDets=100 ] = 0.728
 Average Precision  (AP) @[ IoU=0.75      | area=   all | maxDets=100 ] = 0.204
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.329
 Average Precision  (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Precision  (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=  1 ] = 0.164
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets= 10 ] = 0.453
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=   all | maxDets=100 ] = 0.468
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.472
 Average Recall     (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.000
 Average Recall     (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = -1.000
[09/04 19:08:14 d2.evaluation.coco_evaluation]: Evaluation results for bbox: 
|   AP   |  AP50  |  AP75  |  APs   |  APm  |  APl  |
|:------:|:------:|:------:|:------:|:-----:|:-----:|
| 32.581 | 72.802 | 20.416 | 32.923 | 0.000 |  nan  |
[09/04 19:08:14 d2.evaluation.coco_evaluation]: Note that some metrics cannot be computed.
[09/04 19:08:14 d2.evaluation.coco_evaluation]: Per-category bbox AP: 
| category   | AP   | category   | AP     |
|:-----------|:-----|:-----------|:-------|
| Car        | nan  | Car        | 32.581 |
Out[ ]:
OrderedDict([('bbox',
              {'AP': 32.58125848458507,
               'AP50': 72.80156797550605,
               'AP75': 20.416164316726025,
               'APs': 32.92277916692182,
               'APm': 0.0,
               'APl': nan,
               'AP-Car': 32.58125848458507})])
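The AP columns above are averaged over IoU thresholds from 0.50 to 0.95, which is why AP50 (72.8) is so much higher than AP75 (20.4): the cars are found, but the boxes are not tightly localized. As a reminder of what those thresholds measure, here is a minimal, self-contained sketch of the IoU computation (the boxes are illustrative values, not model outputs):

```python
def iou(box_a, box_b):
    """Intersection-over-union of two (x0, y0, x1, y1) boxes."""
    ix0, iy0 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    ix1, iy1 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0, ix1 - ix0) * max(0, iy1 - iy0)  # zero if the boxes don't overlap
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 0, 15, 10)))  # 50 / 150 ≈ 0.333: counts at IoU=0.30, not at 0.50
```

A prediction like the one above would be a match at a 0.30 threshold but a miss at every threshold COCO evaluates, which is how loose boxes drag AP down.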

Inference with the saved model weights

In [ ]:
%ls ./output/
coco_instances_results.json                       last_checkpoint
events.out.tfevents.1662305966.2bcf4c9aba09.76.0  metrics.json
events.out.tfevents.1662316132.2bcf4c9aba09.76.1  model_final.pth
instances_predictions.pth

Test the model on my custom dataset

In [ ]:
cfg.MODEL.WEIGHTS = os.path.join(cfg.OUTPUT_DIR, "model_final.pth")
cfg.DATASETS.TEST = ("my_dataset_test", )
cfg.MODEL.ROI_HEADS.SCORE_THRESH_TEST = 0.7   # set the testing threshold for this model
predictor = DefaultPredictor(cfg)
test_metadata = MetadataCatalog.get("my_dataset_test")
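Setting `SCORE_THRESH_TEST = 0.7` means only detections the model scores at or above 0.7 survive to the output. A toy sketch of that filtering step (the boxes and scores below are made up for illustration, not actual model outputs):

```python
# Mimic what a score threshold does to a set of detections.
detections = [
    {"box": (12, 40, 30, 55), "score": 0.92},
    {"box": (88, 10, 101, 27), "score": 0.71},
    {"box": (50, 50, 66, 62), "score": 0.45},  # below threshold, dropped
]

SCORE_THRESH = 0.7
kept = [d for d in detections if d["score"] >= SCORE_THRESH]
print(len(kept))  # 2 of the 3 detections survive
```

Raising the threshold trades recall for precision: fewer spurious boxes get painted later, at the cost of missing low-confidence cars.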

The desired algorithm

Convert image to green

In [ ]:
def convert_image_to_green(image: np.ndarray) -> np.ndarray:
    img_green = image.copy() # Work on a copy so the original image is untouched
    
    # Zero out the other two channels (OpenCV images are BGR)
    img_green[:, :, 0] = 0 # blue
    img_green[:, :, 2] = 0 # red
    
    return img_green
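Because OpenCV loads images in BGR channel order, zeroing channels 0 and 2 removes blue and red and leaves only green. A quick self-contained sanity check of the idea (the function is repeated here, assuming only NumPy, so the snippet runs on its own):

```python
import numpy as np

def convert_image_to_green(image: np.ndarray) -> np.ndarray:
    img_green = image.copy()
    img_green[:, :, 0] = 0  # blue (BGR order)
    img_green[:, :, 2] = 0  # red
    return img_green

img = np.random.randint(0, 256, size=(4, 4, 3), dtype=np.uint8)
green = convert_image_to_green(img)
assert green[:, :, 0].max() == 0 and green[:, :, 2].max() == 0  # blue/red gone
assert np.array_equal(green[:, :, 1], img[:, :, 1])  # green channel untouched
```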

Paint bboxes red

In [ ]:
def paint_bbox(image: np.ndarray, bbox: torch.Tensor):
    image_with_boxes = image.copy() # Work on a copy so the original image is untouched
    total_area = 0

    for box in bbox: # One box per prediction in the image
        x0, y0, x1, y1 = (int(v) for v in box) # top-left & bottom-right corners
        w = x1 - x0
        h = y1 - y0
        total_area += w * h
        points = np.array([[x0, y0], [x0, y1], [x1, y1], [x1, y0]])
        image_with_boxes = cv2.fillPoly(image_with_boxes, pts=[points], color=(0, 0, 255)) # fill the box in red (BGR)

    return image_with_boxes, total_area
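A quick self-contained check of the area bookkeeping. To avoid requiring OpenCV here, NumPy slicing stands in for the `cv2.fillPoly` call (equivalent for axis-aligned rectangles, up to edge-pixel conventions); `paint_bbox_np` is a hypothetical stand-in, not part of the notebook:

```python
import numpy as np

def paint_bbox_np(image, boxes):
    """Same idea as paint_bbox, with NumPy slicing in place of cv2.fillPoly."""
    out = image.copy()
    total_area = 0
    for x0, y0, x1, y1 in boxes:
        total_area += (x1 - x0) * (y1 - y0)
        out[y0:y1, x0:x1] = (0, 0, 255)  # red in BGR order
    return out, total_area

img = np.zeros((20, 20, 3), dtype=np.uint8)
painted, area = paint_bbox_np(img, [(2, 2, 6, 6), (10, 10, 15, 12)])
print(area)  # 4*4 + 5*2 = 26
assert (painted[3, 3] == (0, 0, 255)).all()  # inside the first box: red
```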

The scan and paint algorithm

In [ ]:
def scan_and_paint(img: np.ndarray):
    res = img.copy()
    outputs = predictor(res) # predict bboxes for the image
    res = convert_image_to_green(res) # convert to green
    res, cars_area = paint_bbox(res, outputs["instances"].pred_boxes.tensor) # paint bboxes red & sum the cars' area in the image
    img_area = img.shape[0] * img.shape[1] 
    print(f'The cars area is: {cars_area:.3f}')
    print(f'The image area is: {img_area:.3f}')
    print(f'The cars percentage in the image is: {cars_area / img_area * 100:.3f}%') # multiply the fraction by 100 to get a percentage
    cv2_imshow(res)
In [ ]:
for imageName in glob.glob('/content/test/*jpg'):
    im = cv2.imread(imageName)
    scan_and_paint(im)
The cars area is: 927.000
The image area is: 173056.000
The cars percentage in the image is: 0.536%
The cars area is: 195.000
The image area is: 173056.000
The cars percentage in the image is: 0.113%
The cars area is: 0.000
The image area is: 173056.000
The cars percentage in the image is: 0.000%
The cars area is: 539.000
The image area is: 173056.000
The cars percentage in the image is: 0.311%
The cars area is: 571.000
The image area is: 173056.000
The cars percentage in the image is: 0.330%
The cars area is: 2464.000
The image area is: 173056.000
The cars percentage in the image is: 1.424%
The cars area is: 400.000
The image area is: 173056.000
The cars percentage in the image is: 0.231%
The cars area is: 765.000
The image area is: 173056.000
The cars percentage in the image is: 0.442%
The cars area is: 2221.000
The image area is: 173056.000
The cars percentage in the image is: 1.283%
The cars area is: 893.000
The image area is: 173056.000
The cars percentage in the image is: 0.516%
The cars area is: 679.000
The image area is: 173056.000
The cars percentage in the image is: 0.392%
The cars area is: 2765.000
The image area is: 173056.000
The cars percentage in the image is: 1.598%
The cars area is: 360.000
The image area is: 173056.000
The cars percentage in the image is: 0.208%
The cars area is: 2736.000
The image area is: 173056.000
The cars percentage in the image is: 1.581%
The cars area is: 210.000
The image area is: 173056.000
The cars percentage in the image is: 0.121%
The cars area is: 345.000
The image area is: 173056.000
The cars percentage in the image is: 0.199%
The cars area is: 196.000
The image area is: 173056.000
The cars percentage in the image is: 0.113%
The cars area is: 253.000
The image area is: 173056.000
The cars percentage in the image is: 0.146%
The cars area is: 2379.000
The image area is: 173056.000
The cars percentage in the image is: 1.375%
The cars area is: 970.000
The image area is: 173056.000
The cars percentage in the image is: 0.561%
The cars area is: 583.000
The image area is: 173056.000
The cars percentage in the image is: 0.337%
The cars area is: 1746.000
The image area is: 173056.000
The cars percentage in the image is: 1.009%
The cars area is: 2590.000
The image area is: 173056.000
The cars percentage in the image is: 1.497%
The cars area is: 569.000
The image area is: 173056.000
The cars percentage in the image is: 0.329%
The cars area is: 2443.000
The image area is: 173056.000
The cars percentage in the image is: 1.412%
The cars area is: 168.000
The image area is: 173056.000
The cars percentage in the image is: 0.097%
The cars area is: 1236.000
The image area is: 173056.000
The cars percentage in the image is: 0.714%
The cars area is: 492.000
The image area is: 173056.000
The cars percentage in the image is: 0.284%
The cars area is: 2982.000
The image area is: 173056.000
The cars percentage in the image is: 1.723%
The cars area is: 1393.000
The image area is: 173056.000
The cars percentage in the image is: 0.805%
The cars area is: 506.000
The image area is: 173056.000
The cars percentage in the image is: 0.292%
The cars area is: 934.000
The image area is: 173056.000
The cars percentage in the image is: 0.540%
The cars area is: 1532.000
The image area is: 173056.000
The cars percentage in the image is: 0.885%
The cars area is: 2755.000
The image area is: 173056.000
The cars percentage in the image is: 1.592%
The cars area is: 543.000
The image area is: 173056.000
The cars percentage in the image is: 0.314%
The cars area is: 984.000
The image area is: 173056.000
The cars percentage in the image is: 0.569%
The cars area is: 247.000
The image area is: 173056.000
The cars percentage in the image is: 0.143%
The cars area is: 2348.000
The image area is: 173056.000
The cars percentage in the image is: 1.357%
The cars area is: 480.000
The image area is: 173056.000
The cars percentage in the image is: 0.277%
The cars area is: 844.000
The image area is: 173056.000
The cars percentage in the image is: 0.488%
  • An option to view the images & predictions in RGB using Detectron2's built-in instance visualizer
In [ ]:
for imageName in glob.glob('/content/test/*jpg'):
    im = cv2.imread(imageName)
    outputs = predictor(im)
    v = Visualizer(im[:, :, ::-1],  # BGR -> RGB, since Visualizer expects RGB
                   metadata=test_metadata,
                   scale=0.8
                   )
    out = v.draw_instance_predictions(outputs["instances"].to("cpu"))
    cv2_imshow(out.get_image()[:, :, ::-1])  # RGB -> BGR for cv2_imshow